1 Introduction



This article deals with an empirical Bayes modeling approach (by which is meant 
latent ability random sampling in the IRT context) to the item response theory (IRT) 
modeling of psychological tests. Suppose we randomly sample N persons from a 
specified population, and then administer a test consisting of n items. The data 
structure for a randomly selected examinee can be expressed by a random vector 

$$(X_1, \ldots, X_n, \theta),$$

where $X_1, \ldots, X_n$ denote item responses and $\theta$ denotes examinee ability, which is
unobservable. Abstractly, in an empirical Bayes problem the data are modeled by
independent identically distributed (i.i.d.) random vectors

$$(X_1^{(1)}, \ldots, X_n^{(1)}, \theta_1),\ (X_1^{(2)}, \ldots, X_n^{(2)}, \theta_2),\ \ldots,\ (X_1^{(N)}, \ldots, X_n^{(N)}, \theta_N).$$

One important measurement goal is the estimation/prediction of each examinee's $\theta$.
Clearly one should use the first examinee's responses $X_1^{(1)}, \ldots, X_n^{(1)}$ to predict the actual
value of $\theta_1$. However, unless the distribution of $\theta$ is completely specified, there is useful
information in

$$(X_1^{(2)}, \ldots, X_n^{(2)}),\ (X_1^{(3)}, \ldots, X_n^{(3)}),\ \ldots,\ (X_1^{(N)}, \ldots, X_n^{(N)}),$$

the second through $N$th examinee responses, about the unknown distribution of $\theta$ and
thus about the unknown ability $\theta_1$ in particular, which we want to estimate. Thus an
alternative approach to using only $(X_1^{(1)}, \ldots, X_n^{(1)})$ is to use all of the test responses in
making inferences about $\theta_1$.



Let $X_j$ be the score for a randomly selected examinee on the $j$th item; $X_j = 1$ if
the answer is correct, $X_j = 0$ if incorrect, and let

$$X_j = \begin{cases} 1 & \text{with probability } P_j(\theta), \\ 0 & \text{with probability } 1 - P_j(\theta), \end{cases}$$

where $P_j(\theta)$ denotes the probability of correct response for a randomly chosen
examinee of ability $\theta$, that is,

$$P_j(\theta) = P\{X_j = 1 \mid \theta\},$$

where $\theta$ is unknown and has the domain $(-\infty, \infty)$ or some subinterval of $(-\infty, \infty)$.
We make two assumptions about the IRT models of this paper: 

(a) Local Independence (also called Conditional Independence) 

$$P_n(x_1, \ldots, x_n \mid \theta) = \prod_{j=1}^{n} P\{X_j = x_j \mid \theta\}$$

(b) Monotonicity: each $P_j(\theta)$ is strictly increasing in $\theta$.
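The two assumptions can be illustrated with a minimal sketch. It assumes a three-parameter logistic (3PL) form for $P_j(\theta)$, which the text does not prescribe at this point, and the item parameters in `ITEMS` are hypothetical values chosen only for illustration.

```python
import math

def irf_3pl(theta, a, b, c=0.0):
    """Assumed 3PL response function; strictly increasing in theta when a > 0 and c < 1."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def pattern_probability(x, theta, items):
    """P{X_1 = x_1, ..., X_n = x_n | theta} as a product of item terms (local independence)."""
    prob = 1.0
    for x_j, (a, b, c) in zip(x, items):
        p = irf_3pl(theta, a, b, c)
        prob *= p if x_j == 1 else 1.0 - p
    return prob

# Hypothetical item parameters (a_j, b_j, c_j); not taken from any real test.
ITEMS = [(1.2, -0.5, 0.20), (0.8, 0.0, 0.25), (1.5, 0.7, 0.15)]
```

For any fixed ability, the probabilities of all $2^n$ response patterns computed this way sum to one, which is a quick sanity check on the product form.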

Lord (1980) makes an interesting remark about the existence of a prior distribution
for ability:

"In work with published tests, it is usual to test similar groups of examinees
year after year with parallel forms of the same test. When this
happens, we can form a good picture of the frequency distribution of ability
in the next group of examinees to be tested."

This suggests taking an empirical Bayes approach to IRT modeling, in particular
assuming partial knowledge about the distribution of $\theta$ and thereby being able to
make efficient use of the response data to make inferences about the distribution of $\theta$
and thus make inferences about the unobservable examinee abilities. The distribution
of a test response $X_1, \ldots, X_n$ is indexed by $\theta$, which belongs to the parameter space
$\Theta$; that is, each $\theta \in \Theta$ governs a test response distribution. Let $L_n(\theta)$ denote the
log-likelihood, that is

$$L_n(\theta) := \log\{P_n(X_1, \ldots, X_n \mid \theta)\}.$$



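If we assume that the prior distribution has density $\Pi(\theta)$, then according to Bayes' theorem
the posterior density for each given

$$(X_1, \ldots, X_n) = (x_1, \ldots, x_n)$$

can be written as

$$\Pi_n(\theta \mid x_1, \ldots, x_n) = \frac{\exp\{L_n(\theta)\}\,\Pi(\theta)}{P_n(x_1, \ldots, x_n)},$$

where

$$P_n(x_1, \ldots, x_n) = \int_{\Theta} P_n(x_1, \ldots, x_n \mid \theta)\,\Pi(\theta)\,d\theta.$$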
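As a rough numerical illustration of this formula, the sketch below evaluates the posterior density on a grid. The two-parameter logistic items, the standard normal prior, the integration range and the response pattern are all assumptions made for the example, not choices stated in the text.

```python
import math

def irf_2pl(theta, a, b):
    # Assumed two-parameter logistic response function.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, x, items):
    ll = 0.0
    for x_j, (a, b) in zip(x, items):
        p = irf_2pl(theta, a, b)
        ll += math.log(p) if x_j == 1 else math.log(1.0 - p)
    return ll

def std_normal(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def grid_posterior(x, items, prior_density, lo=-5.0, hi=5.0, steps=2001):
    """Return (grid, posterior density values, step) using a Riemann sum for P_n(x)."""
    h = (hi - lo) / (steps - 1)
    grid = [lo + i * h for i in range(steps)]
    unnorm = [math.exp(log_likelihood(t, x, items)) * prior_density(t) for t in grid]
    marginal = sum(unnorm) * h            # approximates P_n(x_1, ..., x_n)
    return grid, [u / marginal for u in unnorm], h

ITEMS = [(1.0, -1.0), (1.2, 0.0), (0.9, 1.0)]   # hypothetical (a_j, b_j)
grid, post, h = grid_posterior([1, 1, 0], ITEMS, std_normal)
```

The normalizing constant computed inside `grid_posterior` plays the role of $P_n(x_1, \ldots, x_n)$, so the returned density integrates to one over the grid.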

Notice that "prior" and "posterior" refer to the relationship between the
distributions and the observation $x_1, \ldots, x_n$. E.g., $\Pi(\theta)$ is prior to $x_1, \ldots, x_n$ and

$$\Pi_n(\theta \mid x_1, \ldots, x_n)$$

is posterior to $x_1, \ldots, x_n$. These ideas can be easily extended to the study of the
asymptotic behaviour of the posterior distribution. In particular, for each
what can be said about the posterior probability of $\theta$ as $n$ tends to infinity?

It has long been part of the IRT folklore that, under the usual empirical Bayes
unidimensional IRT modeling approach, the posterior distribution of $\theta$ given the test
response is approximately normal for a long test. Holland (1990) indicates:

"At present I know of no thorough discussion of the asymptotic posterior
normality of latent variable distributions and this would appear to be an
interesting area for further research."

In classical statistics, when $(X_1, \ldots, X_n)$ are i.i.d., an important result (informally
stated) is that, for $n$ large, the posterior density $\Pi_n(\theta \mid X_1, \ldots, X_n)$ is approximately
equal to the normal density $N(\hat\theta_n, \sigma_n^2)$, where $\hat\theta_n$ is the maximum-likelihood estimator
(or MLE) of $\theta$ and $\sigma_n^2 = \{-L_n''(\hat\theta_n)\}^{-1}$, where $L_n''(\hat\theta_n)$ is the second derivative with
respect to $\theta$ of the log-likelihood evaluated at $\hat\theta_n$. Here $\hat\theta_n$ and $\sigma_n^2$ are functions of
$(X_1, \ldots, X_n)$ only. Intuitively, $\sigma_n^2 \to 0$ in applications, usually like $1/n$.

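Lindley (1965) proposed a heuristic approach to prove the above result by expanding
the log-likelihood in a Taylor series in $\theta$ about $\hat\theta_n$,

$$L_n(\theta) = L_n(\hat\theta_n) + \tfrac{1}{2}(\theta - \hat\theta_n)^2 L_n''(\hat\theta_n) + R_n = L_n(\hat\theta_n) - \frac{(\theta - \hat\theta_n)^2}{2\sigma_n^2} + R_n,$$

where $R_n$ is a remainder term. Since the log-likelihood has a maximum at $\hat\theta_n$, the first
derivative vanishes there. As shown above, the posterior density viewed as a function
of $\theta$ for fixed $x_1, \ldots, x_n$ is proportional to

$$\Pi(\theta)\exp\{L_n(\theta)\}.$$

Therefore,

$$\Pi_n(\theta \mid x_1, \ldots, x_n) \propto \Pi(\theta)\exp\Big\{L_n(\hat\theta_n) - \frac{(\theta - \hat\theta_n)^2}{2\sigma_n^2} + R_n\Big\}.$$

Since $L_n(\hat\theta_n)$ does not involve $\theta$, it may be absorbed into the omitted constant of
proportionality so that

$$\Pi_n(\theta \mid x_1, \ldots, x_n) \propto \Pi(\theta)\exp\Big\{-\frac{(\theta - \hat\theta_n)^2}{2\sigma_n^2} + R_n\Big\}, \qquad (2)$$

where the remainder, $R_n$, is claimed to be negligible when compared with the other
term in (2). Because $\sigma_n^2 \to 0$ like $1/n$, the density in (2) becomes concentrated at
$\hat\theta_n$ in the limit, thus allowing $\Pi(\theta)$ to also be absorbed into the omitted constant of
proportionality. Thus,

$$\Pi_n(\theta \mid x_1, \ldots, x_n) \propto \exp\Big\{-\frac{(\theta - \hat\theta_n)^2}{2\sigma_n^2}\Big\}$$

as desired. However, Lindley (1965) did not give a rigorous proof.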
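The heuristic can be checked numerically on simulated data. The sketch below is only an illustration under stated assumptions: two-parameter logistic items with randomly drawn hypothetical parameters, a standard normal prior, a grid search for the MLE, and a test length of 200. It compares the posterior probability of the interval $\hat\theta_n \pm 1.96\,\sigma_n$ with the nominal normal value of about 0.95.

```python
import math
import random

def p2pl(theta, a, b):
    # Assumed two-parameter logistic response function.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def loglik(theta, x, items):
    total = 0.0
    for x_j, (a, b) in zip(x, items):
        p = p2pl(theta, a, b)
        total += math.log(p) if x_j == 1 else math.log(1.0 - p)
    return total

def neg_second_derivative(theta, items):
    # For the 2PL model, -L_n''(theta) = sum_j a_j^2 P_j (1 - P_j), free of the responses.
    return sum(a * a * p2pl(theta, a, b) * (1.0 - p2pl(theta, a, b)) for a, b in items)

rng = random.Random(0)
n, theta0 = 200, 0.5                      # hypothetical test length and true ability
items = [(0.8 + 0.8 * rng.random(), -2.0 + 4.0 * rng.random()) for _ in range(n)]
x = [1 if rng.random() < p2pl(theta0, a, b) else 0 for a, b in items]

h = 0.005
grid = [-5.0 + i * h for i in range(2001)]
ll = [loglik(t, x, items) for t in grid]
ll_max = max(ll)
theta_hat = grid[ll.index(ll_max)]        # grid approximation to the MLE
sigma_n = neg_second_derivative(theta_hat, items) ** -0.5

# Unnormalized posterior under an assumed N(0, 1) prior; subtracting ll_max avoids underflow.
unnorm = [math.exp(l - ll_max - 0.5 * t * t) for l, t in zip(ll, grid)]
lo, hi = theta_hat - 1.96 * sigma_n, theta_hat + 1.96 * sigma_n
post_prob = sum(u for t, u in zip(grid, unnorm) if lo < t < hi) / sum(unnorm)
```

With a long test the value of `post_prob` should sit close to 0.95; the small remaining gap reflects the prior and the remainder term that the heuristic argument discards.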



Walker (1969) proved that under certain conditions, the posterior probability of
$\hat\theta_n + a\sigma_n < \theta < \hat\theta_n + b\sigma_n$, namely

$$\int_{\hat\theta_n + a\sigma_n}^{\hat\theta_n + b\sigma_n} \Pi_n(\theta \mid X_1, \ldots, X_n)\,d\theta,$$

converges in probability $P_{\theta_0}$ to

$$(2\pi)^{-1/2}\int_a^b e^{-u^2/2}\,du$$

as $n \to \infty$. Here, as the notation $P_{\theta_0}$ indicates, in the generation of $X_1, \ldots, X_n$
we assume $\theta_0$ is the true value of $\theta$. That is, $X_1, \ldots, X_n$ is generated according to
the distribution $P_n(x_1, \ldots, x_n \mid \theta_0)$. Then, using the rules of conditional probability
computation, it is easy to show that one way to interpret Walker's result is that

converges in probability to 

$$(2\pi)^{-1/2}\int_a^b e^{-u^2/2}\,du$$

as $n \to \infty$. That is, for each fixed (but unknown) $\theta_0$ we have an asymptotic confidence
interval for each choice of $a < b$.

As we know, for all realistic applications, the item characteristic curves are not
identical. Therefore, the $\{X_j\}$ we have are merely independent, conditional on $\theta$, but
not identically distributed. However, the general IRT model enables us to prove, by
adapting the approach that Walker (1969) applied to i.i.d. random variables,

(a) The "weak" convergence, that is, for $-\infty < a < b < \infty$,

$$A_n = \int_{\hat\theta_n + a\hat\sigma_n}^{\hat\theta_n + b\hat\sigma_n} \Pi_n(\theta \mid X_1, \ldots, X_n)\,d\theta$$

(with $\hat\sigma_n$ as defined in Section 2.1) converges in probability $P_{\theta_0}$ to

$$A = (2\pi)^{-1/2}\int_a^b e^{-u^2/2}\,du$$

as $n \to \infty$. That is,

$$P_{\theta_0}\{|A_n - A| < \epsilon\} \to 1 \text{ as } n \to \infty, \text{ for arbitrary } \epsilon > 0.$$

(b) The strong convergence of $A_n$, that is,

$$P_{\theta_0}\{\lim_{n\to\infty} A_n = A\} = 1;$$

(c) Convergence in "manifest" probability, or "$\theta_0$ free" convergence, that is, $A_n$ converges
to $A$ in the manifest (or marginal in the sense that $\theta$ is integrated out)
probability $P$, which is defined, for any fixed $n$, by

$$P\{(X_1, \ldots, X_n) = (x_1, \ldots, x_n)\} = \int_{\Theta} P_n(x_1, \ldots, x_n \mid \theta)\,\Pi(\theta)\,d\theta.$$

This result is also easily interpretable as an asymptotic confidence interval for
ability. That is, it assures that

converges in probability to

$$(2\pi)^{-1/2}\int_a^b e^{-u^2/2}\,du$$

as $n \to \infty$. That is, for any randomly sampled examinee, we have an asymptotic
confidence interval for each choice of $a < b$. Here in (c), in contrast to (a), the
value of $\theta$ for the randomly sampled examinee is not fixed.

(d) The weak and strong consistency of the MLE $\hat\theta_n$, which are intermediate results
in the proofs of (a) and (b).

Proving (a)-(c) is the main purpose of this paper, thereby meeting the Holland 
challenge quoted above. 
2 Further Notation and Assumptions

2.1 Basic Notation 

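$\theta_0$: The true parameter. In saying that $X_j$ is a random variable we infer that $X_j$ has
the density

$$P_j(\theta)^{x_j}[1 - P_j(\theta)]^{1 - x_j}, \quad x_j = 0, 1,$$

for some fixed value of $\theta$. Denote this value by $\theta_0$, which we refer to as the true
parameter.

$\hat\theta_n$: The Maximum Likelihood Estimator (MLE) of $\theta$, which is defined as a solution
(in general non-unique) of

$$P_n(X_1, \ldots, X_n \mid \hat\theta_n) = \max_{\theta \in \Theta}\{P_n(X_1, \ldots, X_n \mid \theta)\}, \qquad (3)$$

if it exists, or equivalently, of

$$L_n(\hat\theta_n) = \max_{\theta \in \Theta}\{L_n(\theta)\}. \qquad (4)$$

$I_j(\theta)$: The item information function of item $j$, which is equal to

$$I_j(\theta) = \frac{[P_j'(\theta)]^2}{P_j(\theta)[1 - P_j(\theta)]},$$

where $P_j'(\theta)$ is the first derivative of $P_j(\theta)$ with respect to $\theta$.

$I^{(n)}(\theta)$: The test information function

$$I^{(n)}(\theta) = \sum_{j=1}^{n} I_j(\theta).$$

$\hat\sigma_n^2$:

$$\hat\sigma_n^2 = \{I^{(n)}(\hat\theta_n)\}^{-1}, \qquad (5)$$

noting that our definition of $\hat\sigma_n^2$ used hereafter in the paper differs from the often
used $\sigma_n^2 = \{-L_n''(\hat\theta_n)\}^{-1}$ mentioned above.

$\lambda_j(\theta)$: The logit function of item $j$

$$\lambda_j(\theta) = \log\Big\{\frac{P_j(\theta)}{1 - P_j(\theta)}\Big\}. \qquad (6)$$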
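The quantities above are straightforward to compute once a form for $P_j(\theta)$ is fixed. The following sketch assumes a 3PL response function (an assumption for illustration; the notation applies to any differentiable $P_j$), and the function names are hypothetical.

```python
import math

def irf_3pl(theta, a, b, c):
    """Assumed 3PL response function P_j(theta)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def irf_3pl_derivative(theta, a, b, c):
    """Analytic first derivative P_j'(theta) of the assumed 3PL form."""
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return (1.0 - c) * a * logistic * (1.0 - logistic)

def item_information(theta, a, b, c):
    """I_j(theta) = [P_j'(theta)]^2 / (P_j(theta) [1 - P_j(theta)])."""
    p = irf_3pl(theta, a, b, c)
    return irf_3pl_derivative(theta, a, b, c) ** 2 / (p * (1.0 - p))

def total_information(theta, items):
    """Test information I^(n)(theta): the sum of the item informations."""
    return sum(item_information(theta, *item) for item in items)

def sigma_hat(theta_hat, items):
    """Square root of the reciprocal test information, as in (5)."""
    return total_information(theta_hat, items) ** -0.5

def logit(theta, a, b, c):
    """lambda_j(theta) = log{P_j / (1 - P_j)}, as in (6)."""
    p = irf_3pl(theta, a, b, c)
    return math.log(p / (1.0 - p))
```

When the guessing parameter is zero the model reduces to the 2PL case, where the item information is $a_j^2 P_j(1 - P_j)$ and the logit is the linear function $a_j(\theta - b_j)$.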

2.2 Regularity Conditions 

Some "regularity" conditions and their explanations will be stated before going
into details about our theorems. Fix $\theta_0 \in \Theta$. There are five basic assumptions:

(A1): Let $\theta \in \Theta$, where $\Theta$ is $(-\infty, \infty)$ or a bounded or unbounded interval in
$(-\infty, \infty)$. Let the prior density $\Pi(\theta)$ be continuous and positive at $\theta_0$, where
$\theta_0$ is assumed to be the true value of $\theta$.

(A2): $P_j(\theta)$ is twice continuously differentiable and $P_j'(\theta)$ and $P_j''(\theta)$ are bounded in
absolute value uniformly with respect to both $\theta$ and $j$ in some closed interval
$N_0$ of $\theta_0 \in \Theta$.

(A3): For every fixed $\theta \ne \theta_0$, assume for some given $c(\theta) > 0$

$$\limsup_{n \to \infty} n^{-1}\sum_{j=1}^{n} E_{\theta_0} Z_j(\theta) \le -c(\theta) \qquad (8)$$

and

$$\sup_j |\lambda_j(\theta)| < \infty.$$

(See Footnote 1.) Note that



1 For a sequence of real numbers $\{a_n\}$, if $\lim_{n\to\infty} a_n$ does not exist, then $\{a_n\}$ must have more
than one limit point. $\limsup_{n\to\infty} a_n$ denotes the largest limit point (or upper limit).

(A4): $\{I_j'(\theta)\}$ and $\{\lambda_j''(\theta)\}$ and $\{\lambda_j'''(\theta)\}$ are bounded in absolute value uniformly in
$j$ and in $\theta \in N_0$, $N_0$ specified in (A2) above.

(A5):

$$\liminf_{n \to \infty} \frac{I^{(n)}(\theta_0)}{n} \ge c(\theta_0) > 0.$$

That is, asymptotically, the average information at $\theta_0$ is bounded away from 0.

Although $\Theta$ may be $(-\infty, \infty)$, we always assume without loss of generality that
$\theta_0$ is contained in a finite interval, e.g. $[-a, a]$ for some fixed $a > 0$. This is because
from the psychometric viewpoint, taking $\mathrm{var}(\theta) = 1$ for convenience, the same
educational decision is made about people with $\theta = 4$ and people with $\theta = 24$. Thus,
assuming $-5 < \theta < 5$ does no practical damage.

The condition (8) of assumption (A3), perhaps, looks unfamiliar. But it plays
an important role in the proof of Lemma 3.1 below, ensuring the identifiability of
$\theta_0$. That is, when $\theta_0$ is the true value of $\theta$, $E\{L_n(\theta) - L_n(\theta_0)\}$ should be sufficiently
negative for all values of $\theta \ne \theta_0$. In other words, this condition allows us to "identify"
$\theta_0$ by maximizing the likelihood function. (A3) acts as a remedy in the case that the $\{X_j\}$
are merely independent but not identically distributed. Indeed, if they are
i.i.d., as is the case in Walker's proof, then (A3) is automatically satisfied. To see
this, note in the i.i.d. case that



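Note that

$$E_{\theta_0}\exp\{Z_1(\theta)\} = P_1(\theta_0)\frac{P_1(\theta)}{P_1(\theta_0)} + (1 - P_1(\theta_0))\frac{1 - P_1(\theta)}{1 - P_1(\theta_0)} = 1.$$

Thus, since $-\log x$ is strictly convex, Jensen's inequality (Lehmann, p50) shows that
for arbitrary $\theta \ne \theta_0$

$$E_{\theta_0} Z_1(\theta) = E_{\theta_0}[\log\{Y(\theta)\}] < \log\{E_{\theta_0}[Y(\theta)]\} = 0, \qquad (10)$$

where

$$Y(\theta) = \exp\{Z_1(\theta)\}.$$

Thus (8) is satisfied by taking

$$c(\theta) = -E_{\theta_0}\{Z_1(\theta)\}.$$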

Unfortunately the $\{Z_j(\theta)\}$ in IRT models are not identically distributed, so we have to
impose some supplementary condition. According to (10), $n^{-1}\sum_{j=1}^{n} E_{\theta_0} Z_j(\theta)$ will be
negative; however, this does not enable us to obtain (8). For what classes of IRT
models then does (8) hold? Consider the case in which each $E_{\theta_0} Z_j(\theta)$ satisfies, for
some $c(\theta)$,

$$E_{\theta_0} Z_j(\theta) \le -c(\theta) < 0. \qquad (11)$$

It is obvious that (8) then holds. However, this condition is stronger than needed. It would
suffice to merely require that a "certain proportion" of the $E_{\theta_0} Z_j(\theta)$s satisfy
condition (11), say one in every $K$, no matter how large $K$ is. Mathematically
speaking, this would imply

and so 

n-' EooZjie) < -'c{e) < 0. 

Actually, (8) does not seem very restrictive in IRT models encountered in practice. As
evidence, consider a "typical" IRT model of 40 3PL items, in which the item parameters
are precalibrated from a real ACT math test. The graphs illustrated in Figure 1
are the $E_{\theta_0} Z_j(\theta)$s computed from this model. Clearly (8) seems to hold.
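A computation of this kind can be sketched as follows. The sketch takes $Z_j(\theta)$ to be the item-level log-likelihood ratio of $\theta$ against $\theta_0$, which is how it is used in the proof of Lemma 3.1, and it uses 40 hypothetical 3PL items rather than the ACT calibration, which is not reproduced here.

```python
import math

def irf_3pl(theta, a, b, c):
    # Assumed 3PL response function.
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def expected_z(theta, theta0, item):
    """E_{theta0} Z_j(theta): expected log-likelihood ratio of theta against theta0 for one item."""
    p0, p = irf_3pl(theta0, *item), irf_3pl(theta, *item)
    return p0 * math.log(p / p0) + (1.0 - p0) * math.log((1.0 - p) / (1.0 - p0))

# Hypothetical item parameters (a_j, b_j, c_j) for 40 items; not the ACT values.
ITEMS = [(0.6 + 0.05 * j, -2.0 + 0.1 * j, 0.2) for j in range(40)]
theta0 = 0.0
averages = {theta: sum(expected_z(theta, theta0, it) for it in ITEMS) / len(ITEMS)
            for theta in (-2.0, -1.0, 1.0, 2.0)}
```

Each entry of `averages` is the quantity $n^{-1}\sum_j E_{\theta_0} Z_j(\theta)$ appearing in (8) for one value of $\theta$; for a model like this one every entry is negative, in line with (10).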

(A4) and (A5) are used to make $L_n''(\theta)$ behave sufficiently well for $\theta$ near $\theta_0$. Condition
(A5) implies that the test information function evaluated at $\theta_0$ tends to infinity
with the same speed as $n$. These five conditions would not be difficult to verify in
particular applications and hence are really fairly mild modeling assumptions.

Figure 1: $E_{\theta_0} Z_j(\theta)$s for 40 items, ACT-MATH Test (Drasgow, 1987).



3 The Main Theorems 

In this section we will introduce three theorems and the major steps of the proof
of Theorem 3.1, the basic theorem. The rigorous proofs of these theorems, as well as
their related lemmas and corollaries, are contained in an appendix.

3.1 Convergence in Probability 

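Theorem 3.1 Suppose that conditions (A1) through (A5) hold. Let $\hat\theta_n$ be an MLE
of $\theta_0$, and $\hat\sigma_n$ be the square root of $\{I^{(n)}(\hat\theta_n)\}^{-1}$. Then, for $-\infty < a < b < \infty$, the
posterior probability of $\hat\theta_n + a\hat\sigma_n < \theta < \hat\theta_n + b\hat\sigma_n$, namely

$$A_n = \int_{\hat\theta_n + a\hat\sigma_n}^{\hat\theta_n + b\hat\sigma_n} \Pi_n(\theta \mid X_1, \ldots, X_n)\,d\theta,$$

tends in $P_{\theta_0}$ to

$$A = (2\pi)^{-1/2}\int_a^b e^{-u^2/2}\,du$$

as $n \to \infty$.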



Theorem 3.1 is the basic result in our asymptotic posterior normality work. Note
that $A_n$ is a random variable depending on $X_1, \ldots, X_n$. Thus its distribution is
determined by the parameter $\theta_0$, and $A_n \to A$ in $P_{\theta_0}$ means

$$\lim_{n\to\infty} P_{\theta_0}\{|A_n - A| < \epsilon\} = 1, \text{ for arbitrary } \epsilon > 0.$$
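Outline of Proof. To prove the theorem, write

$$\int_{\hat\theta_n + a\hat\sigma_n}^{\hat\theta_n + b\hat\sigma_n} \Pi_n(\theta \mid X_1, \ldots, X_n)\,d\theta = \frac{G}{P_n(X_1, \ldots, X_n)} = \frac{G}{P_n(X_1, \ldots, X_n \mid \hat\theta_n)\hat\sigma_n} \Big/ \frac{P_n(X_1, \ldots, X_n)}{P_n(X_1, \ldots, X_n \mid \hat\theta_n)\hat\sigma_n},$$

where

$$P_n(X_1, \ldots, X_n) = \int_{\Theta} \Pi(\theta) P_n(X_1, \ldots, X_n \mid \theta)\,d\theta$$

and

$$G = \int_{\hat\theta_n + a\hat\sigma_n}^{\hat\theta_n + b\hat\sigma_n} \Pi(\theta) P_n(X_1, \ldots, X_n \mid \theta)\,d\theta. \qquad (12)$$

It suffices to prove

$$\frac{P_n(X_1, \ldots, X_n)}{P_n(X_1, \ldots, X_n \mid \hat\theta_n)\hat\sigma_n} \to (2\pi)^{1/2}\,\Pi(\theta_0) \qquad (13)$$

as $n \to \infty$, in $P_{\theta_0}$, and

$$\frac{G}{P_n(X_1, \ldots, X_n \mid \hat\theta_n)\hat\sigma_n} \to (2\pi)^{1/2}\,\Pi(\theta_0)\{\Phi(b) - \Phi(a)\} \qquad (14)$$

as $n \to \infty$, in $P_{\theta_0}$, where $\Phi(x) = (2\pi)^{-1/2}\int_{-\infty}^{x} e^{-u^2/2}\,du$.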
In the following we present the general idea for proving (13). ((14) is proved by
a similar method.) First expand $L_n(\theta)$ at $\hat\theta_n$ by Taylor expansion: we have

$$L_n(\theta) - L_n(\hat\theta_n) = \frac{(\theta - \hat\theta_n)^2}{2} L_n''(\theta_n^*) = -\frac{(\theta - \hat\theta_n)^2}{2\hat\sigma_n^2}(1 - R_n), \qquad (15)$$

where $\theta_n^*$ is a point between $\theta$ and $\hat\theta_n$, $\hat\sigma_n^2$ is defined by (5) and $R_n$ is defined by

$$R_n = \{L_n''(\theta_n^*) + I^{(n)}(\hat\theta_n)\}/I^{(n)}(\hat\theta_n). \qquad (16)$$

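Split $P_n(X_1, \ldots, X_n)$ into two parts as follows:

$$P_n(X_1, \ldots, X_n) = \int_{|\theta - \theta_0| > \delta} \Pi(\theta) P_n(X_1, \ldots, X_n \mid \theta)\,d\theta + \int_{|\theta - \theta_0| \le \delta} \Pi(\theta) P_n(X_1, \ldots, X_n \mid \theta)\,d\theta =: G_1 + G_2. \qquad (17)$$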
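Therefore, recalling that $L_n(\theta) = \log P_n(X_1, \ldots, X_n \mid \theta)$,

$$\frac{G_1}{P_n(X_1, \ldots, X_n \mid \hat\theta_n)\hat\sigma_n} = \exp\{L_n(\theta_0) - L_n(\hat\theta_n)\}\{I^{(n)}(\hat\theta_n)\}^{1/2} \times \int_{|\theta - \theta_0| > \delta} \Pi(\theta)\exp\{L_n(\theta) - L_n(\theta_0)\}\,d\theta \qquad (18)$$

and, using (15),

$$\frac{G_2}{P_n(X_1, \ldots, X_n \mid \hat\theta_n)\hat\sigma_n} = \frac{\Pi(\theta_0)}{\hat\sigma_n}\int_{|\theta - \theta_0| \le \delta} \frac{\Pi(\theta)}{\Pi(\theta_0)}\exp\Big\{-\frac{(\theta - \hat\theta_n)^2}{2\hat\sigma_n^2}(1 - R_n)\Big\}\,d\theta. \qquad (19)$$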

Thus, if

$$\frac{G_1}{P_n(X_1, \ldots, X_n \mid \hat\theta_n)\hat\sigma_n} \to 0 \text{ in } P_{\theta_0} \qquad (20)$$

and

$$\frac{G_2}{P_n(X_1, \ldots, X_n \mid \hat\theta_n)\hat\sigma_n} \to (2\pi)^{1/2}\,\Pi(\theta_0) \text{ in } P_{\theta_0}, \qquad (21)$$

then (13) holds. For establishing (20), first consider (18): If $\hat\theta_n$ is consistent then
$\exp\{L_n(\theta_0) - L_n(\hat\theta_n)\}$ goes to a constant as $n$ approaches $\infty$. On the other hand, since
$\{I^{(n)}(\hat\theta_n)\}^{1/2}$ approaches $\infty$ like $n^{1/2}$, we need to make $L_n(\theta) - L_n(\theta_0)$ "sufficiently
negative" so that the integral in (18) approaches 0 faster than $n^{-1/2}$ and hence the
left hand side of (20) can be neglected outside the $\delta$ region of $\theta_0$. As for establishing
(21), consider (19): Since $\Pi(\theta)$ is continuous, $\Pi(\theta)/\Pi(\theta_0)$ will be close to one for $\delta$
sufficiently small, and we need to make $R_n$ "sufficiently small" inside the $\delta$ region
so that we can estimate the integral by

$$\int_{|\theta - \theta_0| \le \delta} \exp\Big\{-\frac{(\theta - \hat\theta_n)^2}{2\hat\sigma_n^2}\Big\}\,d\theta.$$

Mathematically speaking, we need the following two lemmas.



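Lemma 3.1 Suppose that conditions (A1) through (A3) hold. For any $\delta > 0$, there
exists $k(\delta) > 0$ such that

$$\lim_{n\to\infty} P_{\theta_0}\Big\{\sup_{|\theta - \theta_0| > \delta} n^{-1}[L_n(\theta) - L_n(\theta_0)] < -k(\delta)\Big\} = 1.$$

Lemma 3.2 Suppose that conditions (A1) through (A5) hold. Then

$$L_n(\theta) - L_n(\hat\theta_n) = \frac{(\theta - \hat\theta_n)^2}{2} L_n''(\theta_n^*) = -\frac{(\theta - \hat\theta_n)^2}{2\hat\sigma_n^2}(1 - R_n), \qquad (22)$$

where $\theta_n^*$ is a point between $\theta$ and $\hat\theta_n$, and $R_n$ is defined by (16). Also, for any $\epsilon > 0$,
there exists $\delta$ such that

$$\lim_{n\to\infty} P\Big\{\sup_{|\theta - \theta_0| \le \delta} |R_n(\theta, X_1, \ldots, X_n)| < \epsilon\Big\} = 1. \qquad (23)$$

As a by-product, Lemma 3.1 ensures the consistency of the MLE $\hat\theta_n$, which is
labeled as Corollary 3.1.

Corollary 3.1 Suppose that conditions (A1) through (A3) hold. Then $\hat\theta_n$ is weakly
consistent, namely

$$\lim_{n\to\infty} \hat\theta_n = \theta_0 \text{ in } P_{\theta_0}. \qquad (24)$$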
It can be shown that (22) of Lemma 3.2 makes it possible for us to use the
reciprocal of the test information as the variance estimate (see (5)), instead of

$$\{-L_n''(\hat\theta_n)\}^{-1}$$

as Lindley (1965) and Walker (1969) each suggested. The variance estimate (5) we
have chosen has the following advantages:

• The information function $I^{(n)}(\cdot)$ is always positive. $-L_n''(\cdot)$, by contrast, could
be negative, especially when the sample size is not large enough. So sometimes
$\{-L_n''(\cdot)\}^{-1/2}$ may not exist.

• The information function is easier to calculate, while the calculation of $L_n''(\cdot)$ is
more complicated.

Future study should be undertaken to compare the speed of the convergence and to
explore any further advantages.
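The first advantage can be seen on a single item. The sketch below assumes a 3PL item with hypothetical parameters and uses central finite differences for the derivatives; it evaluates $-L_1''(\theta)$ for a correct response from a low-ability examinee and compares it with the item information at the same point.

```python
import math

def irf_3pl(theta, a, b, c):
    # Assumed 3PL response function.
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b, c, h=1e-5):
    """I_j(theta) = [P_j']^2 / (P_j (1 - P_j)), with P_j' from a central difference."""
    p = irf_3pl(theta, a, b, c)
    dp = (irf_3pl(theta + h, a, b, c) - irf_3pl(theta - h, a, b, c)) / (2.0 * h)
    return dp * dp / (p * (1.0 - p))

def neg_loglik_curvature(theta, x, item, h=1e-4):
    """-L_1''(theta) for one response x, by a central second difference."""
    def loglik(t):
        p = irf_3pl(t, *item)
        return math.log(p) if x == 1 else math.log(1.0 - p)
    return -(loglik(theta + h) - 2.0 * loglik(theta) + loglik(theta - h)) / (h * h)

ITEM = (1.0, 0.0, 0.25)                            # hypothetical (a, b, c)
observed = neg_loglik_curvature(-2.0, 1, ITEM)     # correct answer at theta = -2
expected = item_information(-2.0, *ITEM)
```

In this configuration `observed` is negative while `expected` is positive, so a variance estimate built from the observed curvature alone would be undefined for such a response, whereas the information-based estimate (5) is not affected.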

3.2 Convergence Almost Surely 

As discussed in the preceding subsection, the posterior distribution of $(\theta - \hat\theta_n)/\hat\sigma_n$,
derived from a proper prior density $\Pi(\theta)$, converges in probability to the standard
normal distribution. In this subsection we will see that a stronger result, convergence
almost surely (also referred to as strong, almost everywhere, or with
probability one convergence), can be achieved under the same assumptions.

Theorem 3.2 Suppose that conditions (A1) through (A5) hold. Let $\hat\theta_n$ be an MLE
of $\theta_0$, and $\hat\sigma_n$ be the square root of $\{I^{(n)}(\hat\theta_n)\}^{-1}$. Then, for $-\infty < a < b < \infty$, the
posterior probability of $\hat\theta_n + a\hat\sigma_n < \theta < \hat\theta_n + b\hat\sigma_n$, namely

$$A_n = \int_{\hat\theta_n + a\hat\sigma_n}^{\hat\theta_n + b\hat\sigma_n} \Pi_n(\theta \mid X_1, \ldots, X_n)\,d\theta,$$

tends to

$$A = (2\pi)^{-1/2}\int_a^b e^{-u^2/2}\,du \text{ almost surely}$$

as $n \to \infty$.

What is the difference between the conclusions of Theorem 3.1 and Theorem 3.2? 
It is instructive to look at the following two statements which are equivalent to these 
two theorems respectively: 

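• The sequence $\{A_n\}$ is said to converge in probability to $A$ if and only if for
each $\epsilon > 0$,

$$\lim_{n\to\infty} P_{\theta_0}\{|A_n - A| > \epsilon\} = 0,$$

or equivalently

$$\lim_{n\to\infty} P_{\theta_0}\{|A_n - A| \le \epsilon\} = 1. \qquad (25)$$

• The sequence $\{A_n\}$ is said to converge to $A$ almost surely (or with probability one,
strongly, almost everywhere, etc.) if and only if, for each $\epsilon > 0$,

$$\lim_{n\to\infty} P_{\theta_0}\{\max_{m \ge n} |A_m - A| \le \epsilon\} = 1. \qquad (26)$$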

Since (26) clearly implies (25), we have the immediate conclusion that Theorem 3.2 
implies Theorem 3.1. 

In order to have a better understanding about convergence almost surely, it
is interesting to quote the following example by Stout (1974, p9):

"In statistics there are certain situations where almost sure convergence
seems a more relevant concept than convergence in probability. Consider
a physician who treats patients with a drug having the same unknown
cure probability of $p$ for each patient. The physician is willing to continue
use of the drug as long as no superior drug is found. Along with administering
the drug, he estimates the cure probability from time to time by
dividing the number of cures up to that point in time by the number of
patients treated. If $n$ is the number of patients treated, denote this estimating
random variable by $X_{(n)}$. Suppose the physician wishes to estimate
$p$ within a prescribed tolerance $\epsilon > 0$. He asks whether he will ever reach a
point in time such that with high probability, all subsequent estimates will
fall within $\epsilon$ of $p$. That is, he wonders for prescribed $\delta > 0$ whether there
exists an integer $N$ such that

$$P\{\max_{n \ge N} |X_{(n)} - p| \le \epsilon\} \ge 1 - \delta.$$

The weak law of large numbers says only that

$$P\{|X_{(n)} - p| \le \epsilon\} \to 1 \text{ as } n \to \infty$$

and hence does not answer his question. It is only by the strong law of
large numbers that the existence of such an N is indeed guaranteed."
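The physician's question can be explored by simulation. The sketch below is a rough illustration with hypothetical values for the cure probability, the number of patients and the random seed; it tracks the running estimate and the largest deviation over all later time points within the simulated horizon.

```python
import random

def running_estimates(p, n_patients, seed=0):
    """Cure-rate estimates X_(1), ..., X_(n_patients) for one simulated sequence of patients."""
    rng = random.Random(seed)
    cures, estimates = 0, []
    for n in range(1, n_patients + 1):
        cures += 1 if rng.random() < p else 0
        estimates.append(cures / n)
    return estimates

def tail_max_deviation(estimates, p, start):
    """max over n >= start (within the simulated horizon) of |X_(n) - p|."""
    return max(abs(e - p) for e in estimates[start - 1:])

estimates = running_estimates(p=0.3, n_patients=5000)
dev_from_100 = tail_max_deviation(estimates, 0.3, 100)
dev_from_1000 = tail_max_deviation(estimates, 0.3, 1000)
```

The tail maximum can only shrink as the starting point moves later, and the strong law is what guarantees that it eventually stays below any prescribed tolerance with high probability. A finite simulation can suggest this behaviour but cannot prove it.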

3.3 Convergence in Manifest Probability 

Perhaps it may seem confusing to some readers to simultaneously have $\theta$ fixed
at $\theta_0$ and have $\theta$ be a random variable governed by $\Pi(\theta)$, as is the case in Theorems
3.1 and 3.2. Thus some sort of clarification seems needed. The idea that leads to the
adoption of the notation $\theta_0$ is the following: For any given response vector

$$(X_1, \ldots, X_n) = (x_1, \ldots, x_n),$$

if it comes from a randomly selected examinee we can always assume that he or she
has a specific ability, say $\theta_0$. However, in most cases $\theta_0$ is unknown but hypothetically
specified. Under this assumption, the distribution of $X_1, \ldots, X_n$ is induced by $\theta_0$.
On the other hand, the given $x_1, \ldots, x_n$ can also be interpreted just as a pattern.
Our interest is to know the proportion of examinees in the population who would
produce response vector $x_1, \ldots, x_n$. Denote this proportion as

$$P\{(X_1, \ldots, X_n) = (x_1, \ldots, x_n)\} \qquad (27)$$

and call it the manifest probability. It is clear that

$$P\{(X_1, \ldots, X_n) = (x_1, \ldots, x_n)\} \ge 0$$

and

$$\sum_{(x_1, \ldots, x_n)} P\{(X_1, \ldots, X_n) = (x_1, \ldots, x_n)\} = 1.$$

Since we know the prior density $\Pi(\theta)$, (27) can be obtained by integrating the joint
probability with respect to $\theta$, that is

$$P\{(X_1, \ldots, X_n) = (x_1, \ldots, x_n)\} = \int_{\Theta} P_n(x_1, \ldots, x_n \mid \theta)\,\Pi(\theta)\,d\theta.$$
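For a short test the manifest probabilities of all response patterns can be tabulated directly. The sketch below assumes 2PL items with hypothetical parameters, a standard normal prior, and a simple Riemann sum over a wide finite range in place of the integral over $\Theta$.

```python
import itertools
import math

def p2pl(theta, a, b):
    # Assumed two-parameter logistic response function.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def conditional_probability(x, theta, items):
    """P_n(x_1, ..., x_n | theta) under local independence."""
    prob = 1.0
    for x_j, (a, b) in zip(x, items):
        p = p2pl(theta, a, b)
        prob *= p if x_j == 1 else 1.0 - p
    return prob

def manifest_probability(x, items, lo=-8.0, hi=8.0, steps=1601):
    """Riemann-sum approximation of the integral of P_n(x | theta) Pi(theta) over theta."""
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        theta = lo + i * h
        prior = math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)  # assumed N(0, 1) prior
        total += conditional_probability(x, theta, items) * prior * h
    return total

ITEMS = [(1.0, -1.0), (1.3, 0.0), (0.7, 0.5), (1.1, 1.0)]   # hypothetical (a_j, b_j)
manifest = {x: manifest_probability(x, ITEMS) for x in itertools.product((0, 1), repeat=4)}
```

The sixteen values in `manifest` are nonnegative and sum to one up to numerical integration error, matching the two properties stated above.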

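According to Theorem 3.1,

$$\int_{\hat\theta_n + a\hat\sigma_n}^{\hat\theta_n + b\hat\sigma_n} \Pi_n(\theta \mid X_1, \ldots, X_n)\,d\theta \to \Phi(b) - \Phi(a) \qquad (28)$$

in probability $P_{\theta_0}$, with $\Phi$ the standard normal distribution function. It is very interesting
to notice that the right hand side of (28) is
free of $\theta_0$, which suggests that we can further prove that the convergence is "free of
$\theta_0$". Since (28) holds for "every" $\theta_0$, intuitively speaking, it should be true that (28)
holds under the "average of $\theta_0$s". Therefore, we ought to be able to substitute the
manifest probability $P$ for $P_{\theta_0}$:

Theorem 3.3 Suppose that conditions (A1) through (A5) hold. Let $\hat\theta_n$ be defined by
(3) or (4), and $\hat\sigma_n$ be the square root of $\{I^{(n)}(\hat\theta_n)\}^{-1}$. Then, for $-\infty < a < b < \infty$,
the posterior probability of $\hat\theta_n + a\hat\sigma_n < \theta < \hat\theta_n + b\hat\sigma_n$, namely

$$\int_{\hat\theta_n + a\hat\sigma_n}^{\hat\theta_n + b\hat\sigma_n} \Pi_n(\theta \mid X_1, \ldots, X_n)\,d\theta,$$

tends to

$$(2\pi)^{-1/2}\int_a^b e^{-u^2/2}\,du$$

in manifest probability $P$.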

Summarizing the last few paragraphs, Theorem 3.1 implies that the asymptotic
posterior normality holds for any randomly chosen examinee with ability $\theta_0$. On
the other hand, Theorem 3.3 ensures that this asymptotic property holds for any
randomly sampled examinee from the population. In other words, in Theorem 3.1 the
examinee is sampled from the subpopulation with ability $\theta_0$, while in Theorem 3.3 the
examinee is sampled from the whole population. Therefore,
Theorem 3.3 has more general meaning. (The original idea of Theorem 3.3 was
proposed by Brian Junker in personal conversation with one of the authors.)

4 Conclusions 

The asymptotic posterior normality of latent variable distributions has been es- 
tablished under very general and appropriate hypotheses. This result has (at least) 
two important implications. First, it provides a probabilistic basis for assessing ability 
estimation accuracy in the long test case. Second, it provides an important first step 
in making rigorous the Dutch Identity conjecture (Holland, 1990), which, roughly 
speaking, claims that only 2 parameters per item are required in order to obtain good 
long test model fit for unidimensional test data. 

Further, the consistency of the MLE of $\theta$ has been discussed. It is very interesting
to mention that our proof of the consistency of $\hat\theta_n$ is very similar to Wald's
proof (1949) for the $X_1, \ldots, X_n$ i.i.d. case. It is worth remarking that the general
IRT model (that is, non identically distributed responses) yields as powerful asymptotic
results as the i.i.d. model, the favorite model of most statisticians, which has
so many good qualities.

Finally we should indicate that for general multidimensional IRT models the
asymptotic posterior normality can be proved for the random vector $\theta$ given test
response $X_1, \ldots, X_n$, under suitable regularity conditions.
Appendix: Proofs of Main Theorems

In this appendix we will prove the results introduced in Section 3. 



A The Proof of Convergence in Probability 

The proof of Theorem 3.1 is based on Lemma 3.1, Lemma 3.2, and Corollary
3.1. Before going to the proofs, two important theorems, from real analysis and
probability theory respectively, should be introduced here:

Theorem A.1 (Heine-Borel covering theorem) (Billingsley, p566)
If $[a, b] \subset \bigcup_{k=1}^{\infty}(a_k, b_k)$, then $[a, b] \subset \bigcup_{k=1}^{n}(a_k, b_k)$ for some $n$.

Remark: Equivalent to the above theorem is the assertion that a bounded, closed set
is compact.

Theorem A.2 (Strong law of large numbers (Serfling, p27))

Let $X_1, X_2, \ldots$ be independent with means $\mu_1, \mu_2, \ldots$ and variances $\sigma_1^2, \sigma_2^2, \ldots$. If the
series $\sum_{j=1}^{\infty} \sigma_j^2/j^2$ converges, then

$$n^{-1}\sum_{j=1}^{n} X_j - n^{-1}\sum_{j=1}^{n} \mu_j \to 0 \text{ with probability one.}$$

Proof of Lemma 3.1: 

Remark The proof of Lemma 3.1 is an improvement over Walker's result, which only 
covers the i.i.d. case. The strategy used in the proof can be described by two steps: 

(a) to prove, for any $\theta_i \ne \theta_0$, there exists $\delta_i > 0$ such that

$$\lim_{n\to\infty} P_{\theta_0}\Big\{\sup_{|\theta - \theta_i| < \delta_i} n^{-1}[L_n(\theta) - L_n(\theta_0)] < -c_i(\delta_i)\Big\} = 1.$$

We put the subscript $i$ here because we only need a finite number of such $\delta_i$s.

(b) to use Theorem A.1 to cover $\{|\theta - \theta_0| > \delta\} \cap C$, where $C$ is a compact set, by a
finite number of open sets $|\theta - \theta_i| < \delta_i$, $i = 1, \ldots, m$.

2 A set $C$ is defined to be compact if each cover of it by open sets has a finite subcover, that is,
if $[G_\theta : \theta \in \Theta]$ covers $C$ and each $G_\theta$ is open, then some finite subcollection $\{G_{\theta_1}, \ldots, G_{\theta_n}\}$ covers $C$.

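For any $\theta \ne \theta_0$, recalling from (7), the definition of $Z_j(\theta)$, and (9), it follows that

$$n^{-1}[L_n(\theta) - L_n(\theta_0)] = n^{-1}\sum_{j=1}^{n} Z_j(\theta). \qquad (29)$$

Now, from (7),

$$E_{\theta_0} Z_j(\theta) = P_j(\theta_0)\log\Big\{\frac{P_j(\theta)}{P_j(\theta_0)}\Big\} + [1 - P_j(\theta_0)]\log\Big\{\frac{1 - P_j(\theta)}{1 - P_j(\theta_0)}\Big\}. \qquad (30)$$

In order to apply Theorem A.2 to $\{Z_j(\theta)\}$, we need to estimate $\mathrm{var}(Z_j(\theta))$. Writing
$Z_j(\theta)$ using the logit function (see (6)),

$$Z_j(\theta) = X_j[\lambda_j(\theta) - \lambda_j(\theta_0)] + \log\Big\{\frac{1 - P_j(\theta)}{1 - P_j(\theta_0)}\Big\},$$

it follows that

$$\mathrm{var}(Z_j(\theta)) = \mathrm{var}(X_j)[\lambda_j(\theta) - \lambda_j(\theta_0)]^2 = P_j(\theta_0)(1 - P_j(\theta_0))[\lambda_j(\theta) - \lambda_j(\theta_0)]^2.$$

Since, for any fixed $\theta$, $\lambda_j(\theta)$ is bounded in absolute value uniformly in $j$ (assumption
(A3)), this implies that there exists a constant $0 < M(\theta) < \infty$ such that

$$|\mathrm{var}(Z_j(\theta))| \le M(\theta) \text{ for all } j,$$

and hence

$$\sum_{j=1}^{\infty} \frac{\mathrm{var}(Z_j(\theta))}{j^2} < \infty. \qquad (31)$$

Thus we can use the law of large numbers to get

$$n^{-1}\sum_{j=1}^{n} Z_j(\theta) - n^{-1}\sum_{j=1}^{n} E_{\theta_0} Z_j(\theta) \to 0 \text{ w.p.1.} \qquad (32)$$

From (29), (32) and assumption (A3) it follows that

$$P\{\limsup_{n\to\infty} n^{-1}[L_n(\theta) - L_n(\theta_0)] \le -c(\theta) < 0\} = 1 \qquad (33)$$

for some $c(\theta) > 0$.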
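Statement (32) can be illustrated by simulation. The sketch below assumes 2PL items with randomly drawn hypothetical parameters, a true ability $\theta_0 = 0$ and a comparison point $\theta = 1$; it compares the average of the simulated $Z_j(\theta)$ with the average of their expectations (30).

```python
import math
import random

def p2pl(theta, a, b):
    # Assumed two-parameter logistic response function.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def z_value(x, theta, theta0, item):
    """Z_j(theta): log-likelihood ratio of one response, theta against theta0."""
    p, p0 = p2pl(theta, *item), p2pl(theta0, *item)
    return math.log(p / p0) if x == 1 else math.log((1.0 - p) / (1.0 - p0))

def expected_z(theta, theta0, item):
    """E_{theta0} Z_j(theta), as in (30)."""
    p, p0 = p2pl(theta, *item), p2pl(theta0, *item)
    return p0 * math.log(p / p0) + (1.0 - p0) * math.log((1.0 - p) / (1.0 - p0))

rng = random.Random(1)
n, theta0, theta = 20000, 0.0, 1.0       # hypothetical test length and abilities
items = [(0.8 + 0.8 * rng.random(), -2.0 + 4.0 * rng.random()) for _ in range(n)]
responses = [1 if rng.random() < p2pl(theta0, *it) else 0 for it in items]

mean_z = sum(z_value(x, theta, theta0, it) for x, it in zip(responses, items)) / n
mean_expected = sum(expected_z(theta, theta0, it) for it in items) / n
```

By (29), `mean_z` equals $n^{-1}[L_n(\theta) - L_n(\theta_0)]$ for the simulated responses. For a long test it should be close to `mean_expected`, and both should be negative, which is the content of (33).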



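Suppose $N_0$ is the closed interval assumed in condition (A2). For any fixed $\theta' \in
N_0 \subset \Theta$ and for any $\theta$ satisfying $|\theta - \theta'| \le \delta$, define $H_j(\theta', \theta)$ by the following:

$$H_j(\theta', \theta) = \Big|\log\Big\{\frac{P_j(\theta)}{P_j(\theta')}\Big\}\Big| + \Big|\log\Big\{\frac{1 - P_j(\theta)}{1 - P_j(\theta')}\Big\}\Big|.$$

Since $P_j(\theta)$ is strictly increasing in $\theta$, $P_j(\theta') = 1$ and $P_j(\theta') = 0$ can be ruled out.
$H_j(\theta', \theta)$, as a continuous function of $\theta$, will achieve a maximum value over $[\theta' - \delta, \theta' + \delta]$.
Denote this maximum value as $H_j(\delta, \theta')$, that is, there exists $\theta^{(j)} \in [\theta' - \delta, \theta' + \delta]$
such that

$$H_j(\delta, \theta') = H_j(\theta', \theta^{(j)}) = \max_{|\theta - \theta'| \le \delta}\{H_j(\theta', \theta)\}. \qquad (34)$$

Clearly, for each $j$

$$\lim_{\delta \to 0} H_j(\delta, \theta') = 0.$$

Now we have

$$\big|\log\{P_j(\theta)^{X_j}[1 - P_j(\theta)]^{1 - X_j}\} - \log\{P_j(\theta')^{X_j}[1 - P_j(\theta')]^{1 - X_j}\}\big|$$

$$= \Big|X_j\log\Big\{\frac{P_j(\theta)}{P_j(\theta')}\Big\} + (1 - X_j)\log\Big\{\frac{1 - P_j(\theta)}{1 - P_j(\theta')}\Big\}\Big|$$

$$\le \Big|\log\Big\{\frac{P_j(\theta)}{P_j(\theta')}\Big\}\Big| + \Big|\log\Big\{\frac{1 - P_j(\theta)}{1 - P_j(\theta')}\Big\}\Big| \qquad (35)$$

$$= H_j(\theta', \theta) \le H_j(\delta, \theta'). \qquad (36)$$

We shall now prove that $\{P_j(\theta)\}$ is equicontinuous (see Footnote 3). From (A2), $P_j'(\theta)$ is continuous
and bounded in absolute value uniformly in $j$ and in $\theta \in N_0$. By the mean value
theorem,

$$|P_j(\theta) - P_j(\theta')| = |P_j'(\zeta_j)(\theta - \theta')| \le \zeta_P|\theta - \theta'| \text{ for all } j, \qquad (37)$$

3 A function $P$ defined on $(-\infty, \infty)$ is said to be equicontinuous if, given $\epsilon > 0$, there exists a
number $\delta > 0$ such that $|x' - x''| < \delta$ implies $|P(x') - P(x'')| < \epsilon$ for all $x'$, $x''$.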
where $\zeta_j$ is a point between $\theta$ and $\theta'$ for each $j$, and $\zeta_P = \sup_j\{|P_j'(\zeta_j)|\}$, which is finite.
Let $\delta = \epsilon/\zeta_P$ for $\epsilon > 0$; then

if $|\theta - \theta'| < \delta$, $|P_j(\theta) - P_j(\theta')| < \epsilon$ for all $j$.

Recall that $\theta'$ here is any fixed point in $N_0$. Note that

Since $P_j(\theta)$ is strictly increasing in $\theta$,

and 

Therefore, 

From the equicontinuity of $\{P_j(\theta)\}$, for arbitrary $\epsilon > 0$, there exists a sufficiently small
$\delta > 0$ such that

'^°^^W^"<4 1-P,(^') ' ^ ? 

where either $\delta' = \delta$ or $-\delta$. Thus, for all $n$ and for all $\delta$ sufficiently small



Therefore 



$$\limsup_{n\to\infty} n^{-1}\sum_{j=1}^{n} H_j(\delta, \theta') \to 0 \text{ as } \delta \to 0. \qquad (38)$$
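We shall now prove that for any $\theta_i \ne \theta_0$, there exists a sufficiently small $\delta_i > 0$
and sufficiently small $c_i > 0$ such that

$$\lim_{n\to\infty} P\Big\{\sup_{|\theta - \theta_i| < \delta_i} n^{-1}[L_n(\theta) - L_n(\theta_0)] < -c_i\Big\} = 1. \qquad (39)$$

For $\theta \in \{\theta : |\theta - \theta_i| < \delta\}$, according to (29), (7), and (36),

$$n^{-1}[L_n(\theta) - L_n(\theta_0)] = n^{-1}[L_n(\theta_i) - L_n(\theta_0)] + n^{-1}[L_n(\theta) - L_n(\theta_i)].$$

So we have

$$\sup_{|\theta - \theta_i| < \delta} n^{-1}[L_n(\theta) - L_n(\theta_0)] \le n^{-1}[L_n(\theta_i) - L_n(\theta_0)] + n^{-1}\sum_{j=1}^{n} H_j(\delta, \theta_i).$$

Substituting $\theta_i$ for $\theta$ in (33), we will have

$$P\{\limsup_{n\to\infty} n^{-1}[L_n(\theta_i) - L_n(\theta_0)] \le -c(\theta_i)\} = 1, \qquad (40)$$

where $c(\theta_i)$ is positive for all $i$, and from (38) we will have for all $i$

$$\limsup_{n\to\infty} n^{-1}\sum_{j=1}^{n} H_j(\delta, \theta_i) \to 0 \text{ as } \delta \to 0.$$

So there is an open interval $|\theta - \theta_i| < \delta_i$ and a positive number $c_i$, e.g. $c_i = c(\theta_i)/2$, such
that (39) holds.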

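Recall that in assumption (A1) $\Theta$ can be defined by two different domains. In the
following, we will discuss these two cases respectively.

Case 1: If $\Theta$ is a bounded closed subset of $(-\infty, \infty)$, then $\Theta - \{\theta : |\theta - \theta_0| < \delta\}$ is
compact; according to Theorem A.1 it can be covered by finitely many, say $m$,
such open intervals

$$(\theta_1 - \delta_1, \theta_1 + \delta_1),\ (\theta_2 - \delta_2, \theta_2 + \delta_2),\ \ldots,\ (\theta_m - \delta_m, \theta_m + \delta_m).$$

Define event $A_i^{(n)}$ by

$$A_i^{(n)} = \Big\{\sup_{|\theta - \theta_i| < \delta_i} n^{-1}[L_n(\theta) - L_n(\theta_0)] < -c_i\Big\}. \qquad (41)$$

From $P\{A_i^{(n)}\} \to 1$ for each $i$ as $n \to \infty$, we have

$$P\{\cap_{i=1}^{m} A_i^{(n)}\} \to 1.$$

Now we replace $c_i$ in (39) with

$$k(\delta) = \min\{c_1, c_2, \ldots, c_m\}.$$

Therefore, (39) holding for all $i$ implies the conclusion of Lemma 3.1.

Case 2: If $\Theta$ is not bounded, such as $\Theta = (-\infty, \infty)$, we will show

$$\lim_{n\to\infty} P\Big\{\sup_{|\theta| > \Delta} n^{-1}[L_n(\theta) - L_n(\theta_0)] < -c_\Delta < 0\Big\} = 1 \qquad (42)$$

for a sufficiently large positive number $\Delta$. Now

$$\{\theta \in \Theta : |\theta - \theta_0| \ge \delta,\ |\theta| \le \Delta\}$$

is a bounded compact set, so finally we can get the conclusion of Lemma 3.1 from (42) by defining

$$k(\delta) = \min\{c_1, c_2, \ldots, c_m, c_\Delta\}.$$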
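To complete the proof, we have to prove that (42) is correct. Let $|\theta_\Delta| = \Delta$, and rewrite

$$\sup_{|\theta| > \Delta} n^{-1}[L_n(\theta) - L_n(\theta_0)] = n^{-1}[L_n(\theta_\Delta) - L_n(\theta_0)] + \sup_{|\theta| > \Delta} n^{-1}[L_n(\theta) - L_n(\theta_\Delta)], \qquad (43)$$

where

$$n^{-1}[L_n(\theta) - L_n(\theta_\Delta)] = n^{-1}\sum_{j=1}^{n} X_j\log\frac{P_j(\theta)}{P_j(\theta_\Delta)} + n^{-1}\sum_{j=1}^{n} (1 - X_j)\log\frac{1 - P_j(\theta)}{1 - P_j(\theta_\Delta)}.$$

Since $X_j = 0$ or 1, and $P_j(\theta)$ is strictly increasing in $\theta$, then for $\theta > \Delta$ (with $\theta_\Delta = \Delta$),

$$\sup_{\theta > \Delta} n^{-1}[L_n(\theta) - L_n(\theta_\Delta)] \le \sup_{\theta > \Delta} n^{-1}\sum_{j=1}^{n}\log\frac{P_j(\theta)}{P_j(\theta_\Delta)},$$

and for $\theta < -\Delta$ (with $\theta_\Delta = -\Delta$),

$$\sup_{\theta < -\Delta} n^{-1}[L_n(\theta) - L_n(\theta_\Delta)] \le \sup_{\theta < -\Delta} n^{-1}\sum_{j=1}^{n}\log\frac{1 - P_j(\theta)}{1 - P_j(\theta_\Delta)}.$$

Since each item response function has horizontal asymptotes as $\theta \to +\infty$ and $\theta \to -\infty$,
we can prove that

$$\limsup_{n\to\infty}\sup_{\theta > \Delta} n^{-1}\sum_{j=1}^{n}\log\frac{P_j(\theta)}{P_j(\theta_\Delta)} \to 0$$

and

$$\limsup_{n\to\infty}\sup_{\theta < -\Delta} n^{-1}\sum_{j=1}^{n}\log\frac{1 - P_j(\theta)}{1 - P_j(\theta_\Delta)} \to 0$$

as $\Delta \to \infty$. Therefore we have

$$\limsup_{n\to\infty}\sup_{|\theta| > \Delta} n^{-1}[L_n(\theta) - L_n(\theta_\Delta)] \to 0 \text{ as } \Delta \to \infty. \qquad (44)$$

Substituting $\theta_\Delta$ for $\theta$ in (33), we have

$$P\{\limsup_{n\to\infty} n^{-1}[L_n(\theta_\Delta) - L_n(\theta_0)] \le -c_\Delta\} = 1. \qquad (45)$$

Formulas (44) and (45) can be applied to (43) to get (42). Therefore (42) holds. ■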

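Proof of Corollary 3.1: The MLE, if it exists, obviously satisfies

$$L_n(\hat\theta_n) - L_n(\theta_0) = \log\Big\{\frac{P_n(X_1, \ldots, X_n \mid \hat\theta_n)}{P_n(X_1, \ldots, X_n \mid \theta_0)}\Big\} \ge 0 \qquad (46)$$

for all $n$ and for all $X_1, \ldots, X_n$. It is sufficient to prove that for any $\epsilon > 0$ and $\delta > 0$,
there exists $N(\epsilon, \delta)$ such that

$$\mathrm{Prob}\{|\hat\theta_n - \theta_0| < \delta\} > 1 - \epsilon \text{ for all } n > N(\epsilon, \delta).$$

Suppose $\hat\theta_n$ is not consistent; then there exist $\epsilon_0$ and $\delta_0$ such that, for any $N$ there
exists some $n > N$ with

$$\mathrm{Prob}\{|\hat\theta_n - \theta_0| > \delta_0\} > \epsilon_0.$$

Therefore we can obtain a subsequence $\{\hat\theta_{n_m}\}$ such that

$$\mathrm{Prob}\{|\hat\theta_{n_m} - \theta_0| > \delta_0\} > \epsilon_0 \text{ for all } m. \qquad (47)$$

Thus,

$$\epsilon_0 \le \limsup_{n\to\infty}\mathrm{Prob}\{|\hat\theta_n - \theta_0| > \delta_0\} \le \mathrm{Prob}\{\limsup_{n\to\infty}\{|\hat\theta_n - \theta_0| > \delta_0\}\}.$$

It is obvious that the event

$$\limsup_{n\to\infty}\,[|\hat\theta_n - \theta_0| > \delta_0]$$

implies that

$$\sup_{|\theta - \theta_0| > \delta_0}[L_n(\theta) - L_n(\hat\theta_n)] \ge 0 \text{ for infinitely many } n,$$

because $\theta = \hat\theta_n$ is a possible value. But then according to (46) the event

$$\sup_{|\theta - \theta_0| > \delta_0}[L_n(\theta) - L_n(\theta_0)] \ge 0 \text{ for infinitely many } n$$

has a probability greater than or equal to $\epsilon_0$. This contradicts Lemma 3.1, which implies
that for any $\epsilon > 0$, there exists $N$ such that

$$\mathrm{Prob}\Big\{\sup_{|\theta - \theta_0| > \delta_0}[L_n(\theta) - L_n(\theta_0)] \ge 0\Big\} < \epsilon \text{ for all } n > N.$$

This completes the proof. ■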



Proof of Lemma 3.2: Without loss of generality, we first consider the case $\hat\theta_n \in [|\theta - \theta_0| < \delta] \subset N_{\theta_0}$. Since $\hat\theta_n$ is consistent, the probability of $\hat\theta_n$ being contained in the neighborhood of $\theta_0$ will be close to one when $n$ is sufficiently large.

The second derivative of the log likelihood function can be written as

$$L_n''(\theta) = \sum_{j=1}^{n}\lambda_j''(\theta)[X_j - P_j(\theta)] - \sum_{j=1}^{n} I_j(\theta). \tag{48}$$

To prove (48), first notice that it suffices to prove it for $n = 1$, that is,

$$L_1''(\theta) = \lambda_1''(\theta)[X_1 - P_1(\theta)] - I_1(\theta). \tag{49}$$

Note that

$$L_1(\theta) = \lambda_1(\theta)X_1 + \log(1 - P_1(\theta)),$$

so that

$$L_1''(\theta) = \lambda_1''(\theta)X_1 + [\log(1 - P_1(\theta))]''.$$

Comparing this with (49), it remains to show that

$$-[\log(1 - P_1(\theta))]'' = \lambda_1''(\theta)P_1(\theta) + I_1(\theta). \tag{50}$$

However, by definition,

$$I_1(\theta) = E_\theta[-L_1''(\theta)] = -\lambda_1''(\theta)P_1(\theta) - [\log(1 - P_1(\theta))]'',$$

which is equivalent to (50).
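As a numerical sanity check of identity (49), the following sketch compares a finite-difference second derivative of the one-item log likelihood with $\lambda_1''(\theta)[X_1 - P_1(\theta)] - I_1(\theta)$. The three-parameter logistic item and its parameter values are illustrative assumptions, not taken from the paper; the information is computed through the standard Bernoulli form $I(\theta) = P'(\theta)^2/[P(\theta)(1-P(\theta))]$, which agrees with $E_\theta[-L_1''(\theta)]$.

```python
import math

# Hypothetical 3PL item; A, B, C are illustrative values, not from the paper.
A, B, C = 1.2, 0.3, 0.2

def prob(theta):
    """Item response function P(theta) of the assumed 3PL item."""
    return C + (1.0 - C) / (1.0 + math.exp(-A * (theta - B)))

def logit(theta):
    """lambda(theta) = log[P(theta) / (1 - P(theta))]."""
    p = prob(theta)
    return math.log(p / (1.0 - p))

def loglik(theta, x):
    """One-item log likelihood L_1(theta) for a response x in {0, 1}."""
    p = prob(theta)
    return x * math.log(p) + (1 - x) * math.log(1.0 - p)

def d1(f, t, h=1e-5):
    """Central finite-difference first derivative."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def d2(f, t, h=1e-4):
    """Central finite-difference second derivative."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def info(theta):
    """Item information P'(theta)^2 / [P(theta) (1 - P(theta))]."""
    p = prob(theta)
    dp = d1(prob, theta)
    return dp * dp / (p * (1.0 - p))

def identity_gap(theta, x):
    """Absolute difference between the two sides of (49)."""
    lhs = d2(lambda t: loglik(t, x), theta)
    rhs = d2(logit, theta) * (x - prob(theta)) - info(theta)
    return abs(lhs - rhs)
```

For any ability value and either response, `identity_gap` should be small, limited only by finite-difference error.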



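Consider the numerator of $|R_n|$:

$$\begin{aligned}
|L_n''(\theta_n^*) + I^{(n)}(\hat\theta_n)| ={}& \Big|\sum_{j=1}^{n}[\lambda_j''(\theta_n^*) - \lambda_j''(\theta_0)][X_j - P_j(\theta_n^*)] + \sum_{j=1}^{n}\lambda_j''(\theta_0)[X_j - P_j(\theta_0)] \\
& + \sum_{j=1}^{n}\lambda_j''(\theta_0)[P_j(\theta_0) - P_j(\theta_n^*)] + \sum_{j=1}^{n}\{I_j(\hat\theta_n) - I_j(\theta_n^*)\}\Big| \\
\le{}& \sum_{j=1}^{n}|\lambda_j''(\theta_n^*) - \lambda_j''(\theta_0)| \\
& + \Big|\sum_{j=1}^{n}\lambda_j''(\theta_0)[X_j - P_j(\theta_0)]\Big| \\
& + \Big|\sum_{j=1}^{n}\lambda_j''(\theta_0)[P_j(\theta_0) - P_j(\theta_n^*)]\Big| \\
& + \sum_{j=1}^{n}|I_j(\hat\theta_n) - I_j(\theta_n^*)|.
\end{aligned} \tag{51}$$

Note that $\theta_n^*$ depends on $\theta$ and $\hat\theta_n$ through the Taylor expansion and that the distribution of $\hat\theta_n$ depends on $\theta_0$. From (37),

$$\Big|\sum_{j=1}^{n}\lambda_j''(\theta_0)[P_j(\theta_0) - P_j(\theta_n^*)]\Big| \le |\theta_n^* - \theta_0|\, n\, c_p. \tag{52}$$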

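From the mean value theorem,

$$|\lambda_j''(\theta_n^*) - \lambda_j''(\theta_0)| = |\lambda_j'''(\xi_n^{(1)})(\theta_n^* - \theta_0)|$$

and

$$|I_j(\hat\theta_n) - I_j(\theta_n^*)| = |I_j'(\xi_n^{(2)})(\hat\theta_n - \theta_n^*)|,$$

where $\xi_n^{(1)}$ is a point between $\theta_n^*$ and $\theta_0$, and $\xi_n^{(2)}$ is a point between $\hat\theta_n$ and $\theta_n^*$. According to assumption (A4), the third derivative of the logit function, $\lambda_j'''(\theta)$, and the first derivative of the information function, $I_j'(\theta)$, are bounded in absolute value uniformly in $j$ and in $\theta$; therefore,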



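$$\sum_{j=1}^{n}|\lambda_j''(\theta_n^*) - \lambda_j''(\theta_0)| \le |\theta_n^* - \theta_0|\, n\, c_\lambda, \tag{53}$$

and

$$\sum_{j=1}^{n}|I_j(\hat\theta_n) - I_j(\theta_n^*)| \le |\hat\theta_n - \theta_n^*|\, n\, c_I. \tag{54}$$

Note that $c_\lambda$ and $c_I$ are finite positive numbers and they are independent of $j$. We shall now prove

$$\Big|\sum_{j=1}^{n}\lambda_j''(\theta_0)[X_j - P_j(\theta_0)]\Big| = O_p(n^{1/2}). \tag{55}$$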

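(See the footnote.) Assumption (A4) ensures that $\{\lambda_j''(\theta_0)\}$ is bounded in absolute value uniformly in $j$. By Chebyshev's inequality, for some $M > 0$ and every $K > 0$,

$$P\Big\{\Big|\sum_{j=1}^{n}\lambda_j''(\theta_0)[X_j - P_j(\theta_0)]\big/n^{1/2}\Big| \ge K\Big\} \le \frac{M}{K^2};$$

that is, for arbitrary $\varepsilon > 0$, take $K = [M/\varepsilon]^{1/2}$; then we have

$$P\Big\{\Big|\sum_{j=1}^{n}\lambda_j''(\theta_0)[X_j - P_j(\theta_0)]\big/n^{1/2}\Big| < K\Big\} \ge 1 - \varepsilon \quad\text{for all } n,$$

which means we have (55).

Footnote: The notation $a_n = O_p(b_n)$ means that $a_n$ is stochastically bounded by $b_n$ in probability; that is, $a_n = O_p(b_n)$ if and only if for arbitrary $\varepsilon > 0$ there exist $M_\varepsilon$ and $N_\varepsilon$ such that

$$P\{|a_n/b_n| < M_\varepsilon\} > 1 - \varepsilon \quad\text{for all } n > N_\varepsilon.$$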
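The Chebyshev step can be checked exactly on a small test by enumerating every response pattern. In the sketch below, the weights stand in for the values $\lambda_j''(\theta_0)$ and the probabilities for $P_j(\theta_0)$; both are hypothetical numbers chosen for illustration, and $M$ is taken to be the exact variance of the normalized sum rather than a uniform bound.

```python
import itertools
import math

def chebyshev_check(weights, probs, eps):
    """Return (exact tail probability, Chebyshev bound M / K^2, K).

    weights: hypothetical stand-ins for lambda_j''(theta_0).
    probs:   hypothetical stand-ins for P_j(theta_0).
    Responses are independent given theta (local independence).
    """
    n = len(weights)
    # Variance of n^{-1/2} * sum_j w_j (X_j - P_j); plays the role of M.
    M = sum(w * w * p * (1.0 - p) for w, p in zip(weights, probs)) / n
    K = math.sqrt(M / eps)
    tail = 0.0
    for pattern in itertools.product((0, 1), repeat=n):
        mass = 1.0  # probability of this response pattern
        s = 0.0     # value of sum_j w_j (x_j - P_j)
        for x, w, p in zip(pattern, weights, probs):
            mass *= p if x == 1 else 1.0 - p
            s += w * (x - p)
        if abs(s) / math.sqrt(n) >= K:
            tail += mass
    return tail, M / (K * K), K
```

With $K = [M/\varepsilon]^{1/2}$ the bound equals $\varepsilon$, so the exact tail probability returned by the function never exceeds $\varepsilon$, whatever the test length.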



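Formulas (52), (53), (54), and (55) can be applied to (51) to get

$$|L_n''(\theta_n^*) + I^{(n)}(\hat\theta_n)| \le \{|\theta_n^* - \theta_0| + |\hat\theta_n - \theta_n^*|\}\, n\, c + O_p(n^{1/2}), \tag{56}$$

where

$$c = c_p + c_\lambda + c_I.$$

We shall now prove

$$\lim_{n\to\infty} P\{I^{(n)}(\hat\theta_n)/n > c/2 > 0\} = 1. \tag{57}$$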

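By assumption (A4),

$$n^{-1}|I^{(n)}(\hat\theta_n) - I^{(n)}(\theta_0)| \le n^{-1}\sum_{j=1}^{n}|I_j(\hat\theta_n) - I_j(\theta_0)| \le |\hat\theta_n - \theta_0|\, c_I. \tag{58}$$

By using the consistency of $\hat\theta_n$ and (58), we get

$$I^{(n)}(\hat\theta_n)/n - I^{(n)}(\theta_0)/n \to 0 \quad\text{in } P_{\theta_0}, \text{ as } n \to \infty.$$

Thus, by assumption (A5), we have (57).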

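From (56) and (57) we obtain

$$\sup_{|\theta-\theta_0|<\delta}|R_n(\theta, X_1,\ldots,X_n)| \le \sup_{|\theta-\theta_0|<\delta}\left|\frac{L_n''(\theta_n^*) + I^{(n)}(\hat\theta_n)}{I^{(n)}(\hat\theta_n)}\right| \le \sup_{|\theta-\theta_0|<\delta}\frac{\{|\theta_n^* - \theta_0| + |\hat\theta_n - \theta_n^*|\}\,c}{I^{(n)}(\hat\theta_n)/n} + O_p(n^{-1/2}).$$

Note that

$$|\hat\theta_n - \theta_n^*| \le |\hat\theta_n - \theta_0| + |\theta_n^* - \theta_0| \quad\text{and}\quad |\theta_n^* - \theta_0| \le |\theta - \theta_0| + |\hat\theta_n - \theta_0|,$$

where the second inequality follows from the fact that $\theta_n^*$ is between $\theta$ and $\hat\theta_n$. Therefore

$$\sup_{|\theta-\theta_0|<\delta}|R_n(\theta, X_1,\ldots,X_n)| \le \sup_{|\theta-\theta_0|<\delta}\frac{(3|\hat\theta_n - \theta_0| + 2|\theta - \theta_0|)\,c}{I^{(n)}(\hat\theta_n)/n} + O_p(n^{-1/2}).$$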



For any $\varepsilon > 0$, choose $\delta > 0$ sufficiently small; then we have (23), recalling that $\hat\theta_n \to \theta_0$ in $P_{\theta_0}$ and (36).

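The above proof is based on the assumption that $\hat\theta_n$ is in the neighborhood $(\theta_0 - \delta, \theta_0 + \delta)$, so we have just proved that the conditional probability approaches one:

$$\lim_{n\to\infty} P[U_n \mid V_n] = 1, \tag{59}$$

where

$$U_n = \Big\{\sup_{|\theta-\theta_0|<\delta}|R_n(\theta, X_1,\ldots,X_n)| < \varepsilon\Big\}$$

and

$$V_n = \{|\hat\theta_n - \theta_0| < \delta\}.$$

Since Corollary 3.1 implies

$$\lim_{n\to\infty} P[V_n] = 1, \tag{60}$$

it is obvious that (59) and (60) imply $\lim_{n\to\infty} P[U_n] = 1$. This finishes the proof. ■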

Proof of Theorem 3.1:

Remark: The following proof uses a methodology similar to that of Walker (1969). The proof itself does not use any i.i.d. assumption. Instead, it depends only on the results of Lemma 3.1 and Lemma 3.2.

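As we discussed in Section 3.1, it suffices to prove (13) and (14). To prove (13) it suffices to prove (20) and (21). Let us start with (20). Rewrite $G_1$ as

$$\begin{aligned}
G_1 &= P_n(X_1,\ldots,X_n\mid\hat\theta_n)\int_{|\theta-\theta_0|\ge\delta}\Pi(\theta)\exp\{L_n(\theta) - L_n(\hat\theta_n)\}\,d\theta \\
&= P_n(X_1,\ldots,X_n\mid\hat\theta_n)\exp\{L_n(\theta_0) - L_n(\hat\theta_n)\}\int_{|\theta-\theta_0|\ge\delta}\Pi(\theta)\exp\{L_n(\theta) - L_n(\theta_0)\}\,d\theta.
\end{aligned}$$

Since $\hat\theta_n$ is an MLE,

$$L_n(\theta_0) - L_n(\hat\theta_n) \le 0, \tag{61}$$

and therefore $\exp\{L_n(\theta_0) - L_n(\hat\theta_n)\} \le 1$. So we have

$$\begin{aligned}
\frac{G_1}{P_n(X_1,\ldots,X_n\mid\hat\theta_n)\,\sigma_n} &= \{I^{(n)}(\hat\theta_n)\}^{1/2}\int_{|\theta-\theta_0|\ge\delta}\Pi(\theta)\exp\{L_n(\theta) - L_n(\hat\theta_n)\}\,d\theta \\
&= \exp\{L_n(\theta_0) - L_n(\hat\theta_n)\}\{I^{(n)}(\hat\theta_n)\}^{1/2}\int_{|\theta-\theta_0|\ge\delta}\Pi(\theta)\exp\{L_n(\theta) - L_n(\theta_0)\}\,d\theta \\
&\le \{I^{(n)}(\hat\theta_n)\}^{1/2}\,G_0,
\end{aligned} \tag{62}$$

where

$$G_0 = \int_{|\theta-\theta_0|\ge\delta}\Pi(\theta)\exp\{L_n(\theta) - L_n(\theta_0)\}\,d\theta.$$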

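By Lemma 3.1, for any $\delta > 0$, there exists $k(\delta) > 0$ such that

$$\lim_{n\to\infty} P_{\theta_0}\{U_n\} = 1,$$

where

$$U_n = \Big[\sup_{|\theta-\theta_0|\ge\delta} n^{-1}[L_n(\theta) - L_n(\theta_0)] < -k(\delta) < 0\Big]. \tag{63}$$

Define

$$V_n = [G_0 \le \exp\{-nk(\delta)\}]; \tag{64}$$

notice that on $U_n$ the integrand of $G_0$ is at most $\Pi(\theta)\exp\{-nk(\delta)\}$, and

$$\exp\{-nk(\delta)\}\int_{|\theta-\theta_0|\ge\delta}\Pi(\theta)\,d\theta \le \exp\{-nk(\delta)\}.$$

Because $U_n \subseteq V_n$, we have

$$\lim_{n\to\infty} P_{\theta_0}\{G_0 \le \exp\{-nk(\delta)\}\} = 1.$$

Since

$$\{I^{(n)}(\hat\theta_n)\}^{1/2}\exp\{-nk(\delta)\} \to 0 \quad\text{in } P_{\theta_0}, \text{ as } n \to \infty,$$

it follows (using (62)) that

$$\lim_{n\to\infty}\frac{G_1}{P_n(X_1,\ldots,X_n\mid\hat\theta_n)\,\sigma_n} = 0 \quad\text{in } P_{\theta_0}. \tag{65}$$

Thus (20) holds.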
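The key point in the last step is that exponential decay dominates the square-root growth of the test information. A minimal sketch of this, under the added assumption (ours, for illustration) that the test information satisfies $I^{(n)}(\hat\theta_n) \le n\,C$ for some constant $C$; the values of `info_per_item` and `k` are hypothetical:

```python
import math

def scaled_tail_bound(n, info_per_item, k):
    """Upper bound sqrt(n * info_per_item) * exp(-n * k) on {I^(n)}^{1/2} exp{-n k}.

    Assumes I^(n) <= n * info_per_item (an illustrative assumption).
    """
    return math.sqrt(n * info_per_item) * math.exp(-n * k)

# Hypothetical constants: C = 0.36 and k(delta) = 0.05.
bounds = [scaled_tail_bound(n, 0.36, 0.05) for n in (10, 100, 1000)]
```

The entries of `bounds` shrink rapidly as the test length grows, which is the behaviour used to pass from (62) to (65).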



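Now we prove (21). From (15), rewrite $G_2$ as

$$\begin{aligned}
G_2 &= P_n(X_1,\ldots,X_n\mid\hat\theta_n)\int_{|\theta-\theta_0|<\delta}\Pi(\theta)\exp\{L_n(\theta) - L_n(\hat\theta_n)\}\,d\theta \\
&= P_n(X_1,\ldots,X_n\mid\hat\theta_n)\int_{|\theta-\theta_0|<\delta}\Pi(\theta)\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1 - R_n)\Big\}\,d\theta \\
&= P_n(X_1,\ldots,X_n\mid\hat\theta_n)\,\Pi(\theta_0)\int_{|\theta-\theta_0|<\delta}\frac{\Pi(\theta)}{\Pi(\theta_0)}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1 - R_n)\Big\}\,d\theta.
\end{aligned}$$

We shall now observe

$$\frac{G_2}{P_n(X_1,\ldots,X_n\mid\hat\theta_n)\,\sigma_n} = \frac{\Pi(\theta_0)}{\sigma_n}\int_{|\theta-\theta_0|<\delta}\frac{\Pi(\theta)}{\Pi(\theta_0)}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1 - R_n)\Big\}\,d\theta. \tag{66}$$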



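From condition (A1), in particular the continuity of $\Pi(\theta)$, for any $\varepsilon > 0$ we can choose $\delta$ such that $\{\theta : |\theta - \theta_0| < \delta\} \subset N_{\theta_0}$ and

$$1 - \varepsilon \le \inf_{|\theta-\theta_0|<\delta}\frac{\Pi(\theta)}{\Pi(\theta_0)} \le \sup_{|\theta-\theta_0|<\delta}\frac{\Pi(\theta)}{\Pi(\theta_0)} \le 1 + \varepsilon. \tag{67}$$

Then, using (66),

$$\frac{(1-\varepsilon)\,\Pi(\theta_0)}{\sigma_n}\,G_3 \le \frac{G_2}{P_n(X_1,\ldots,X_n\mid\hat\theta_n)\,\sigma_n} \le \frac{(1+\varepsilon)\,\Pi(\theta_0)}{\sigma_n}\,G_3, \tag{68}$$

where

$$G_3 = \int_{|\theta-\theta_0|<\delta}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1 - R_n)\Big\}\,d\theta. \tag{69}$$

For any $\varepsilon > 0$, define

$$C_n = \Big[\sup_{|\theta-\theta_0|<\delta}|R_n(\theta, X_1,\ldots,X_n)| < \varepsilon\Big], \tag{70}$$

and

$$D_n = \Big[\int_{|\theta-\theta_0|<\delta}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1+\varepsilon)\Big\}\,d\theta \le G_3 \le \int_{|\theta-\theta_0|<\delta}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1-\varepsilon)\Big\}\,d\theta\Big]. \tag{71}$$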

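Now we should get rid of $R_n$. Since $C_n \subseteq D_n$, and for any $\varepsilon > 0$, from Lemma 3.2,

$$\lim_{n\to\infty} P_{\theta_0}(C_n) = 1,$$

this implies $\lim_{n\to\infty} P_{\theta_0}(D_n) = 1$. That is, the probability of the event

$$\int_{|\theta-\theta_0|<\delta}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1+\varepsilon)\Big\}\,d\theta \le G_3 \le \int_{|\theta-\theta_0|<\delta}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1-\varepsilon)\Big\}\,d\theta \tag{72}$$

converges to 1 as $n \to \infty$. Therefore, recalling (17), (65), (68), and (69), the only thing left to establish (13) is to observe that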

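$$\int_{|\theta-\theta_0|<\delta}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1+\varepsilon')\Big\}\,d\theta = (2\pi)^{1/2}(1+\varepsilon')^{-1/2}\sigma_n\big[\Phi\{\sigma_n^{-1}(\theta_0 + \delta - \hat\theta_n)(1+\varepsilon')^{1/2}\} - \Phi\{\sigma_n^{-1}(\theta_0 - \delta - \hat\theta_n)(1+\varepsilon')^{1/2}\}\big], \tag{73}$$

where $\varepsilon' = \varepsilon$ or $-\varepsilon$. Since $\hat\theta_n$ is consistent and $\sigma_n \to 0$ in probability, when $\varepsilon < 1$,

$$\begin{aligned}
\theta_0 + \delta - \hat\theta_n &\to \delta && \text{in } P_{\theta_0}, \\
\theta_0 - \delta - \hat\theta_n &\to -\delta && \text{in } P_{\theta_0}, \\
\sigma_n^{-1}(\theta_0 + \delta - \hat\theta_n)(1+\varepsilon')^{1/2} &\to +\infty && \text{in } P_{\theta_0}, \\
\sigma_n^{-1}(\theta_0 - \delta - \hat\theta_n)(1+\varepsilon')^{1/2} &\to -\infty && \text{in } P_{\theta_0}.
\end{aligned}$$

So

$$\Phi\{\sigma_n^{-1}(\theta_0 + \delta - \hat\theta_n)(1+\varepsilon')^{1/2}\} \to 1 \quad\text{in } P_{\theta_0},$$

and, in the same way, $\Phi\{\sigma_n^{-1}(\theta_0 - \delta - \hat\theta_n)(1+\varepsilon')^{1/2}\} \to 0$ in $P_{\theta_0}$. Therefore, the difference in the square brackets of (73) converges to unity in probability. Since $\varepsilon$ is arbitrary, this proves (13).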

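Now we prove (14). First of all we consider (12) and (17) again: $G$ and $G_2$ are the same except for their regions of integration: one is $(\hat\theta_n + a\sigma_n, \hat\theta_n + b\sigma_n)$ and the other is $\{\theta : |\theta - \theta_0| < \delta\}$. For the same $\varepsilon$ and $\delta$ given by (67), if $(\hat\theta_n + a\sigma_n, \hat\theta_n + b\sigma_n)$ is a subset of $\{\theta : |\theta - \theta_0| < \delta\}$ we must have

$$1 - \varepsilon \le \inf_{\theta\in(\hat\theta_n + a\sigma_n,\,\hat\theta_n + b\sigma_n)}\frac{\Pi(\theta)}{\Pi(\theta_0)} \le \sup_{\theta\in(\hat\theta_n + a\sigma_n,\,\hat\theta_n + b\sigma_n)}\frac{\Pi(\theta)}{\Pi(\theta_0)} \le 1 + \varepsilon. \tag{74}$$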

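Define

$$E_n = [(\hat\theta_n + a\sigma_n, \hat\theta_n + b\sigma_n) \subseteq \{\theta : |\theta - \theta_0| < \delta\}].$$

Since $\hat\theta_n \to \theta_0$ in $P_{\theta_0}$ and $\sigma_n \to 0$ in $P_{\theta_0}$,

$$P_{\theta_0}(E_n) \to 1 \quad\text{as } n \to \infty, \tag{75}$$

and hence the probability of (74) converges to 1 as $n \to \infty$. Consider (68) again. If $(\hat\theta_n + a\sigma_n, \hat\theta_n + b\sigma_n)$ is a subset of $\{\theta : |\theta - \theta_0| < \delta\}$, and if we replace the regions of integration in (68) by $(\hat\theta_n + a\sigma_n, \hat\theta_n + b\sigma_n)$, then the new inequality (76) below will still hold.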

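$$\frac{(1-\varepsilon)\,\Pi(\theta_0)}{\sigma_n}\,G_3' \le \frac{G}{P_n(X_1,\ldots,X_n\mid\hat\theta_n)\,\sigma_n} \le \frac{(1+\varepsilon)\,\Pi(\theta_0)}{\sigma_n}\,G_3', \tag{76}$$

where

$$G_3' = \int_{\hat\theta_n + b\sigma_n}^{\hat\theta_n + a\sigma_n}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1 - R_n)\Big\}\,d\theta. \tag{77}$$

Because of (75), the probability of the event indicated by (76) converges to 1 as $n \to \infty$. For the same $\varepsilon$ given by (72), define

$$C_n' = \Big[\sup_{\theta\in(\hat\theta_n + a\sigma_n,\,\hat\theta_n + b\sigma_n)}|R_n(\theta, X_1,\ldots,X_n)| < \varepsilon\Big], \tag{78}$$

and

$$D_n' = \Big[\int_{\hat\theta_n + b\sigma_n}^{\hat\theta_n + a\sigma_n}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1+\varepsilon)\Big\}\,d\theta \le G_3' \le \int_{\hat\theta_n + b\sigma_n}^{\hat\theta_n + a\sigma_n}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1-\varepsilon)\Big\}\,d\theta\Big]. \tag{79}$$

From (75) and $E_n \cap C_n \subseteq C_n' \subseteq D_n'$,

$$P_{\theta_0}(D_n') \to 1 \quad\text{as } n \to \infty.$$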

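Similar to (73), now we shall estimate

$$\int_{\hat\theta_n + b\sigma_n}^{\hat\theta_n + a\sigma_n}\exp\Big\{-\frac{(\theta-\hat\theta_n)^2}{2\sigma_n^2}(1+\varepsilon')\Big\}\,d\theta, \tag{80}$$

where $\varepsilon' = \varepsilon$ or $-\varepsilon$. It is obvious that the quantity in (80) is equal to

$$(2\pi)^{1/2}\sigma_n(1+\varepsilon')^{-1/2}\big[\Phi\{a(1+\varepsilon')^{1/2}\} - \Phi\{b(1+\varepsilon')^{1/2}\}\big].$$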
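The closed form for (80) follows from the substitution $z = (\theta - \hat\theta_n)(1+\varepsilon')^{1/2}/\sigma_n$. As a hedged numerical check, the sketch below compares the closed form with a midpoint-rule quadrature. The names `lo` and `hi` denote the standardized lower and upper limits of integration, and the values of `sigma`, `eps`, and `center` (standing in for $\sigma_n$, $\varepsilon'$, and $\hat\theta_n$) are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def closed_form(lo, hi, sigma, eps):
    """(2 pi)^{1/2} sigma (1+eps)^{-1/2} [Phi(hi s) - Phi(lo s)], s = (1+eps)^{1/2}."""
    s = math.sqrt(1.0 + eps)
    return (math.sqrt(2.0 * math.pi) * sigma / s
            * (std_normal_cdf(hi * s) - std_normal_cdf(lo * s)))

def midpoint_integral(lo, hi, sigma, eps, center=0.0, steps=20000):
    """Midpoint rule for the integral of exp{-(t-center)^2 (1+eps) / (2 sigma^2)}
    over (center + lo*sigma, center + hi*sigma)."""
    left = center + lo * sigma
    width = (hi - lo) * sigma / steps
    total = 0.0
    for i in range(steps):
        t = left + (i + 0.5) * width
        total += math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2) * (1.0 + eps))
    return total * width
```

The two functions agree to quadrature accuracy for either sign of `eps`, and the result does not depend on `center`, which mirrors the fact that (80) depends on $\hat\theta_n$ only through the limits of integration.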

Since we can make $\varepsilon$ arbitrarily small, using (76) and (77) we finally obtain

$$\frac{G}{P_n(X_1,\ldots,X_n\mid\hat\theta_n)\,\sigma_n} \to (2\pi)^{1/2}\,\Pi(\theta_0)\{\Phi(a) - \Phi(b)\}$$

in probability $P_{\theta_0}$.



B The Proof of Strong Convergence 

The proof of Theorem 3.2 is analogous to that of Theorem 3.1 and is also based on two lemmas and one corollary. However, these intermediate results are stronger than those used in proving Theorem 3.1.

Lemma B.1 Under the assumptions of Lemma 3.1, for any given $\delta > 0$, there exists $k(\delta) > 0$ such that

$$P_{\theta_0}\Big\{\overline{\lim_{n\to\infty}}\sup_{|\theta-\theta_0|\ge\delta} n^{-1}[L_n(\theta) - L_n(\theta_0)] < -k(\delta)\Big\} = 1. \tag{81}$$

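Proof: The proof of (81) is analogous to that of Lemma 3.1 except for the following two changes:

(1) replacing (39) by

$$P_{\theta_0}\Big\{\overline{\lim_{n\to\infty}}\sup_{|\theta-\theta_1|<\delta} n^{-1}[L_n(\theta) - L_n(\theta_0)] < -c_1\Big\} = 1; \tag{82}$$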

(2) replacing (41) by 

/iS"' = {h^ sup n-'lUe) - Lr^iOo)] < -Ci}. 

|fl-fl.|<6 

Now we only need to prove (82). Since

$$\overline{\lim_{n\to\infty}}\, n^{-1}[L_n(\theta_1) - L_n(\theta_0)]$$

is measurable with respect to the tail $\sigma$-field, by Kolmogorov's 0-1 law (Billingsley, p. 295) it must be a "nonrandom" constant with probability 1. Denote this constant as $\eta$. According to (40),

PeAv = Jmn-'ILM) - In(^o)l < < 0} = 1. 

Choose 

2 

and choose 6 small enough such that 

n 

n— ♦oo •' 
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(see (34) for the definition of i/j(6,di))> thus 

n 

sup n-'[Ue) - UOo)] < T5^n-MM^0-^n(M + H""'^^^'^*'^'^ 

< T? + e < -c{6i) almost surely. 



lim 

n-^oo \e-ei\<s 



Thus (82) holds. ■

Corollary B.1 Lemma B.1 ensures that

$$P_{\theta_0}\Big\{\lim_{n\to\infty}\hat\theta_n = \theta_0\Big\} = 1.$$

Proof: Analogous to that of Wald (1949) and omitted. ■

Lemma B.2 Under the assumptions of Lemma 3.2, for any $\varepsilon > 0$, there exists $\delta$ such that

$$P_{\theta_0}\Big\{\overline{\lim_{n\to\infty}}\sup_{|\theta-\theta_0|<\delta}|R_n(X_1,\ldots,X_n,\theta)| < \varepsilon\Big\} = 1. \tag{83}$$

Proof: Analogous to that of Lemma 3.2 and omitted. ■

Proof of Theorem 3.2: Based on Lemma B.1, Lemma B.2 and Corollary B.1. The basic steps are analogous to those of Theorem 3.1 and omitted. ■



C The Proof of Convergence in Manifest Probability

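Proof of Theorem 3.3: Theorem 3.1 implies that for arbitrary $\theta$ and arbitrary $\varepsilon > 0$,

$$P_\theta\{|A_n(X_1,\ldots,X_n) - A| > \varepsilon\} \to 0,$$

as $n \to \infty$. Define

$$H_n(\theta,\varepsilon) = P_\theta\{|A_n(X_1,\ldots,X_n) - A| > \varepsilon\}.$$

It is clear that for any $\theta$ and $\varepsilon > 0$,

$$0 \le H_n(\theta,\varepsilon) \le 1 \quad\text{and}\quad \lim_{n\to\infty} H_n(\theta,\varepsilon) = 0.$$

By Lebesgue's bounded convergence theorem (Billingsley, p. 214),

$$\int_\Theta H_n(\theta,\varepsilon)\,\Pi(\theta)\,d\theta \to 0.$$

That is,

$$\begin{aligned}
P\{|A_n(X_1,\ldots,X_n) - A| > \varepsilon\} &= \int_\Theta P\{|A_n(X_1,\ldots,X_n) - A| > \varepsilon \mid \theta\}\,\Pi(\theta)\,d\theta \\
&= \int_\Theta H_n(\theta,\varepsilon)\,\Pi(\theta)\,d\theta \to 0.
\end{aligned}$$

This proves Theorem 3.3.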